Conditional Random Fields, Planted Satisfaction, and Entropy Concentration

نویسندگان

  • Emmanuel Abbe
  • Andrea Montanari
چکیده

This paper studies a class of probabilistic models on graphs, where edge variables depend on incident node variables through a fixed probability kernel. The class includes planted constraint satisfaction problems (CSPs), as well as more general structures motivated by coding and community clustering problems. It is shown that under mild assumptions on the kernel, the conditional entropy of the node variables given the edge variables concentrates around a deterministic threshold. This implies in particular the concentration of the number of solutions in a broad class of planted CSPs, the existence of a threshold function for the disassortative stochastic block model, and the proof of a conjecture posed by [RKPSS10] on parity check codes. It also establishes new connections among coding, clustering and satisfiability. ∗ Department of Electrical Engineering and Program in Applied and Computational Mathematics, Princeton University, Email: [email protected] † Departments of Electrical Engineering and Statistics, Stanford University, Email: [email protected]

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Conditional Random Fields, Planted Constraint Satisfaction and Entropy Concentration

This paper studies a class of probabilistic models on graphs, where edge variables depend on incident node variables through a fixed probability kernel. The class includes planted constraint satisfaction problems (CSPs), as well as more general structures motivated by coding and community clustering problems. It is shown that under mild assumptions on the kernel and for sparse random graphs, th...

متن کامل

Combination of Machine Learning Methods for Optimum Chinese Word Segmentation

This article presents our recent work for participation in the Second International Chinese Word Segmentation Bakeoff. Our system performs two procedures: Out-ofvocabulary extraction and word segmentation. We compose three out-of-vocabulary extraction modules: Character-based tagging with different classifiers – maximum entropy, support vector machines, and conditional random fields. We also co...

متن کامل

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

We present conditional random fields , a framework for building probabilistic models to segment and label sequence data. Conditional random fields offer several advantages over hidden Markov models and stochastic grammars for such tasks, including the ability to relax strong independence assumptions made in those models. Conditional random fields also avoid a fundamental limitation of maximum e...

متن کامل

A Preferred Definition of Conditional Rényi Entropy

The Rényi entropy is a generalization of Shannon entropy to a one-parameter family of entropies. Tsallis entropy too is a generalization of Shannon entropy. The measure for Tsallis entropy is non-logarithmic. After the introduction of Shannon entropy , the conditional Shannon entropy was derived and its properties became known. Also, for Tsallis entropy, the conditional entropy was introduced a...

متن کامل

Shallow Parsing with Conditional Random Fields

Conditional random fields for sequence labeling offer advantages over both generative models like HMMs and classifiers applied at each sequence position. Among sequence labeling tasks in language processing, shallow parsing has received much attention, with the development of standard evaluation datasets and extensive comparison among methods. We show here how to train a conditional random fiel...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1305.4274  شماره 

صفحات  -

تاریخ انتشار 2013